PART III.



DETAILED ACTION 



Claims 1-27 are presented for examination. 



Claim Rejections - 35 USC § 103 



2. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the 
manner in which the invention was made. 



3. Claims 1-10 and 12-27 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Dakss et al. (USPN: 6,642,940), hereinafter Dakss, in view of Ratakonda et al. (USPN:
5,956,026).

As per claims 1 (method) and 18 (system), Dakss discloses a method for skimming video 
data wherein the video data comprises a plurality of scenes as the technique of Object 5 Sandra 
Hair and Object 6 Cecil Suit (see Fig. 2), comprising the steps of: 

Obtaining a plurality of shots for each scene using a shot segmentation and forming a 
structure information index corresponding to each shot is taught by Dakss as the technique of 
Object 5 includes Shots 22 and 26, Object 6 includes shots 23 and 25 (see Fig. 2) and each frame 
of a shot is analyzed and segmented into regions, these region identities through the action of the 
author, who identifies and labels them (see col. 3 line 66 to col. 4 line 2);

Selecting at least one shot from each scene based on the structure information index is
taught by Dakss as the technique of each frame of a shot is analyzed and segmented into regions, 
these region identities through the action of the author, who identifies and labels them (see col. 3 
line 66 to col. 4 line 2); 

Dakss, however, does not disclose the limitations of selecting at least one section from 
the selected shot and reproducing selected sections from each scene to skim the video data. 

Ratakonda discloses the limitations of selecting at least one section from the selected shot 
and reproducing selected sections from each scene to skim the video data as the technique of 
determining the number of keyframes to be allocated within each shot (see col. 2, lines 19-21) 
and constructing a hierarchical summarization with multiple levels wherein levels vary in terms
of detail of frames (see col. 2, lines 31-33).

It would have been obvious to one having ordinary skill in the art at the time the
invention was made to include Ratakonda's teachings of selecting at least one section from the
selected shot and reproducing selected sections from each scene to skim the video data into that
of Dakss' invention. By doing so, the system would be enhanced by allowing the user to select any
number of keyframes or sections within the selected shot prior to reproducing a video sequence.
Thus, the system would provide an enhanced editing tool to its end user.

As per claims 17 (method), 22 (method), and 27 (system), due to the similarity of each of
these claims to claim 1, these claims are rejected for the same reasons applied to
claim 1.

As per claim 19, this claim is largely similar to claim 1, except that it recites a
video skimming apparatus for searching and browsing digital video data comprising a user
interface for inputting external control information. The limitation of a video skimming
apparatus for searching and browsing digital video data comprising a user interface for
inputting external control information is taught by Dakss as the technique of HyperActive:
An Automated Tool for Creating Hyperlinked Video (see col. 4, lines 26-27), which facilitates
rapid index searching to identify previously classified objects as candidate matches to a new
object (see col. 3, lines 3-5), and user interface 550 generates words or graphic images on display
534 to prompt action by the user, and accepts user commands from keyboard 530 and position
pointing device (see col. 10, lines 51-54). This claim is therefore rejected for the reasons set
forth above.

As per claims 2 (method) and 23 (method), the limitation that the structural information index
includes at least one of scene information, shot information and temporal information is taught
by Dakss as the technique of scene information of Object 5 Sandra Hair and Object 6 Cecil Suit
(see Fig. 2) and raw video is first "temporally segmented", i.e., broken up into its constituent
shots. Differences between pixel values in temporally adjacent frames are computed, summed,
and compared to a threshold value. If the threshold is exceeded, the two frames are likely to be
on either side of a shot boundary. If not, they are probably both within the same shot (see col. 4,
lines 30-36). These claims are therefore rejected for the reasons as set forth above.

As per claims 3 (method) and 24 (method), Dakss discloses the limitation of wherein the
scene information includes a logical story unit and the shot information includes a physical
editing unit as the technique of scene information of Objects 5 and 6 being stored as level 1 and the
second level of shot information of shots 22 and 26 under Object 5 and shots 23 and 25 under
Object 6, respectively (see Fig. 2).

Dakss, however, does not disclose the limitation of temporal information includes 
information concerning start and end of each shot. 

Ratakonda discloses the limitation of temporal information includes information
concerning start and end of each shot as the technique of temporal nature of video sequence (see
col. 13, line 39), which includes the step of Detect Shot Boundary 38 (see Fig. 4), or the user
may manually specify the beginning and ending frames (see col. 5, lines 39-40).

It would have been obvious to one having ordinary skill in the art at the time the
invention was made to include Ratakonda's teaching of temporal information that includes information
concerning start and end of each shot into that of Dakss' invention. By doing so, the system
would be enhanced by providing more shot boundary information to its end user.

As per claims 4 (method) and 25 (method), Dakss discloses the limitation of wherein
when shots are being selected from each scene, selection of multiple shots having similar
properties is minimized as the technique of a "shot" refers to a sequence of successive frames
created by a single camera (see col. 3, lines 63-64), raw video is first "temporally segmented",
i.e., broken up into its constituent shots. Differences between pixel values in temporally adjacent
frames are computed, summed, and compared to a threshold value. If the threshold is exceeded,
the two frames are likely to be on either side of a shot boundary. If not, they are probably both
within the same shot. Once this process is repeated for all pairs of temporally adjacent
frames, it is possible to identify and track the objects that appear in each shot (see col. 4,
lines 30-38). These claims are therefore rejected for the reasons as set forth above. 
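
For illustration only, the temporal segmentation quoted above from Dakss can be sketched in Python as follows. This is a minimal sketch and not the implementation of either reference: the names frame_difference and segment_into_shots, the flattened grayscale frame representation, the use of absolute differences, and the threshold value are all assumptions introduced here.

```python
from typing import List, Sequence, Tuple

# Hypothetical representation: one frame is a flat sequence of grayscale pixel values.
Frame = Sequence[int]


def frame_difference(a: Frame, b: Frame) -> int:
    """Sum of absolute pixel differences between two equally sized frames."""
    if len(a) != len(b):
        raise ValueError("frames must have the same number of pixels")
    return sum(abs(x - y) for x, y in zip(a, b))


def segment_into_shots(frames: Sequence[Frame], threshold: int) -> List[Tuple[int, int]]:
    """Return (first_frame, last_frame) index pairs, one per detected shot."""
    if not frames:
        return []
    shots: List[Tuple[int, int]] = []
    start = 0
    for i in range(1, len(frames)):
        # A summed difference above the threshold suggests that frames i-1 and i
        # lie on either side of a shot boundary.
        if frame_difference(frames[i - 1], frames[i]) > threshold:
            shots.append((start, i - 1))
            start = i
    # Close the final shot, which runs to the last frame.
    shots.append((start, len(frames) - 1))
    return shots
```

Under these assumptions, four 4-pixel frames in which the third frame differs sharply from the second would be split into two shots covering frames 0-1 and 2-3.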

As per claim 5, Dakss discloses the invention substantially as claimed above. Dakss, 
however, does not disclose the limitation of shots to be used for skimming are selected by giving 
a higher weight value to shots located at a latter part of each scene. 

Ratakonda discloses the limitation of shots to be used for skimming are selected by
giving a higher weight value to shots located at a latter part of each scene as the technique of in
an actual GUI implementation, the children-parent relationships may be explicitly indicated during
display (see col. 5, lines 54-56).

It would have been obvious to one having ordinary skill in the art at the time the
invention was made to include Ratakonda's teaching of selecting shots to be used for skimming by giving a higher
weight value to shots located at a latter part of each scene by parent-child relationship into that of
Dakss' invention. By doing so, the system would be enhanced by providing more weighted
information in terms of parent-child hierarchical relationship to its end user.

As per claims 6 (method) and 26 (method), Dakss discloses the invention substantially as 
claimed above. Dakss, however, does not disclose the limitation of when selecting at least one 
section from the selected shot, the selected section is from at least one of front section, rear 
section, center section of the selected shot.

Ratakonda discloses the limitation of when selecting at least one section from the
selected shot, the selected section is from at least one of front section, rear section, center section
of the selected shot as the technique of the user may manually specify the beginning and ending
frames (see col. 5, lines 39-40).

It would have been obvious to one having ordinary skill in the art at the time the
invention was made to include Ratakonda's teaching of when selecting at least one section from
the selected shot, the selected section is from at least one of front section, rear section, center
section of the selected shot into that of Dakss' invention. By doing so, the system would be
enhanced by providing an enhanced video editing tool to its end user.

As per claim 7, Dakss discloses the invention substantially as claimed above. Dakss, 
however, does not disclose the limitation of wherein each reproduction length of selected 
sections from selected shots is the same. 

Ratakonda discloses the limitation of wherein each reproduction length of selected 
sections from selected shots is the same as the technique of shot boundary detection 38 is 
performed using a threshold method, where differences between histograms of successive frames 
are compared. Given total number of keyframes 40, each shot is assigned a number of keyframes 
42 depending on the action within the shot, according to well known technique (see col. 4, lines 
51-57). 

It would have been obvious to one having ordinary skill in the art at the time the
invention was made to include Ratakonda's teaching of wherein each reproduction length of
selected sections from selected shots is the same into that of Dakss' invention. By doing so, the
system would be enhanced by reproducing video of the same length for each of the selected sections of
keyframes. Thus, the system would provide an enhanced video editing tool to its end
user.

As per claim 8, Dakss discloses the invention substantially as claimed above. Dakss,
however, does not disclose the limitation of wherein if the reproduction length of the selected
section is larger than a shot length of the corresponding selected shot, then the reproduction
length of the selected section is decreased to be less than or equal to the shot length.

Ratakonda discloses the limitation of wherein if the reproduction length of the selected
section is larger than a shot length of the corresponding selected shot, then the reproduction
length of the selected section is decreased to be less than or equal to the shot length as the
technique of Compressed Video Input wherein available video streams are in a compressed
format for compact storage. The method may be extended to a compressed bitstream in such a
way as to extract keyframes while performing minimal decoding (see col. 14, lines 15-23).

It would have been obvious to one having ordinary skill in the art at the time the
invention was made to include Ratakonda's teaching of wherein if the reproduction length of the
selected section is larger than a shot length of the corresponding selected shot, then the
reproduction length of the selected section is decreased to be less than or equal to the shot length
by technique of compressing video input into that of Dakss' invention. By doing so, the system
would be enhanced by performing minimal decoding. Thus, the speed of the system would also
increase because the decoding step is minimized.

As per claim 9, Dakss discloses the invention substantially as claimed above. Dakss,
however, does not disclose the limitation of wherein each section comprises a plurality of frames 
and each reproduction length of selected sections from selected shots is chosen in response to a 
dissimilarity factor of neighboring frames. 

Ratakonda discloses the limitations of wherein each section comprises a plurality of 
frames and each reproduction length of selected sections from selected shots is chosen in 
response to a dissimilarity factor of neighboring frames as the technique of if a shot has n frames 
and K frames are to be allocated, every (n/K) th frame is selected as a keyframe (see col. 7, lines 
47-49) and shot boundary detection 38 is performed using a threshold method, where differences 
between histograms of successive frames are compared. Given total number of keyframes 40,
each shot is assigned a number of keyframes 42 depending on the action within the shot, 
according to well known technique (see col. 4, lines 51-57). 
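
For illustration only, the uniform allocation quoted above from Ratakonda (a shot of n frames with K keyframes, taking every (n/K)th frame) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name uniform_keyframes, zero-based frame indices, starting at the first frame of the shot, and rounding down when n is not a multiple of K are choices made here and are not taken from the reference.

```python
from typing import List


def uniform_keyframes(n: int, k: int) -> List[int]:
    """Pick k evenly spaced frame indices from a shot of n frames.

    Assumption: selection starts at frame 0 and rounds down, so the
    i-th keyframe is frame floor(i * n / k).
    """
    if n <= 0:
        raise ValueError("shot must contain at least one frame")
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and n")
    # Integer arithmetic avoids floating-point rounding surprises.
    return [(i * n) // k for i in range(k)]
```

With these assumptions, a 10-frame shot allocated 5 keyframes yields frames 0, 2, 4, 6 and 8.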

It would have been obvious to one having ordinary skill in the art at the time the
invention was made to include Ratakonda's teachings of wherein each section comprises a
plurality of frames and each reproduction length of selected sections from selected shots is
chosen in response to a dissimilarity factor of neighboring frames into that of Dakss' invention.
By doing so, the system would be enhanced by providing an enhanced video editing tool in terms
of comparing a desired keyframe and its successive frames.

Application/Control Number: 09/932,713 Page 10

Art Unit: 2173

As per claim 10, Dakss discloses the invention substantially as claimed above. Dakss,
however, does not disclose the limitations of wherein the dissimilarity factor is determined in
response to at least one of image, motion and audio similarities in individual shots, and the
reproduction length of selected section is adjusted in response to the dissimilarity factor.

Ratakonda discloses the limitations of wherein the dissimilarity factor is determined in
response to at least one of image, motion and audio similarities in individual shots, and the
reproduction length of selected section is adjusted in response to the dissimilarity factor as the
technique of Using Motion Characteristic for Summarization, which provides an option for the
pan frames to be converted into an image mosaic for viewing purposes since detection of pan and
zoom both involve computing motion vectors (see col. 11, lines 27-33) and Compressed Video
Input wherein available video streams are in a compressed format for compact storage. The
method may be extended to a compressed bitstream in such a way as to extract keyframes while
performing minimal decoding (see col. 14, lines 15-23).

It would have been obvious to one having ordinary skill in the art at the time the
invention was made to include Ratakonda's teachings of wherein the dissimilarity factor is
determined in response to at least one of image, motion and audio similarities in individual shots,
and the reproduction length of selected section is adjusted in response to the dissimilarity factor
into that of Dakss' invention. By doing so, the system would be enhanced by providing an
enhanced video editing tool to its end user.

As per claim 12, due to the similarity of this claim to that of claim 8, this claim is 
therefore rejected for the same reason applied to claim 8.

As per claim 13, the limitation of reproduction is varied in response to an external input
is taught by Dakss as the technique of user interface 550 generates words or graphic images on 
display 534 to prompt action by the user, and accepts user commands from keyboard 530 and 
position pointing device (see col. 10, lines 51-54). This claim is therefore rejected for the reasons 
as set forth above. 

As per claim 14, Dakss discloses the invention substantially as claimed above. Dakss, 
however, does not disclose the limitations of wherein the selected sections are reproduced at a
high speed by increasing a number of frames to be decoded per unit time.

Ratakonda discloses the limitations of wherein the selected sections are reproduced at a
high speed by increasing a number of frames to be decoded per unit time as the technique of the
computation performance of the keyframe generation depends heavily upon the hard disk access
speed of the computer used to practice the method of the invention. For example, "real time
processing" means the ability to process 30 frames per second at a given resolution. For a 300
frame quarter common intermediate format color sequence (176x144 resolution), it was found
that construction of the histograms took 11 seconds, while the rest of the processing took less
than a second on a SUN Ultra SPARC-2. Thus, provided that histogram computation may be
achieved in real time, it should be easy to achieve real time hierarchical keyframe generation. It
may also be noted that the processing after computation of the histograms is independent of the
actual frame resolution, thus the amount taken to process a 300 frame QCIF sequence is the same
as that of processing a sequence at 1024x780 resolution, provided that the histograms of each
frame have been pre-computed (see col. 13 line 59 to col. 14 line 9).

It would have been obvious to one having ordinary skill in the art at the time the
invention was made to include Ratakonda's teaching of wherein the selected sections are
reproduced at a high speed by increasing a number of frames to be decoded per unit time into
that of Dakss' invention. By doing so, the system would be enhanced by providing high-speed
reproduction of keyframes. Thus, the system would save time for its end
user.

As per claim 15, due to the partial similarity of this claim to claim 9, this claim is
rejected for the same reason applied to claim 9.

As per claim 16, Dakss discloses the invention substantially as claimed above. Dakss, 
however, does not disclose the limitation of when the video data uses a coding scheme utilizing 
interframe compression, then I frames are selected for obtaining frame data for decoding only 
corresponding frames. 

Ratakonda discloses the limitation of when the video data uses a coding scheme utilizing
interframe compression, then I frames are selected for obtaining frame data for decoding only
corresponding frames as the technique of the video browsing method described herein may have
applications which go beyond simply providing an effective user interface for multi-media
manipulation. It provides an understanding of the temporal nature of the video sequence, which
may be potentially employed in second generation video coding systems. A hierarchy of
keyframes may be used in designing encoders which intelligently, and more importantly,
computationally efficiently, adapt to the nature of the temporal video stream and thus provide higher
quality while utilizing lesser resources. Information on how to utilize a hierarchy of video
frames in improving compression is available in the literature, where the multi scale nature of a
segmentation algorithm is exploited to obtain lossless still image compression (see col. 13, lines
36-52).

It would have been obvious to one having ordinary skill in the art at the time the
invention was made to include Ratakonda's teaching of when the video data uses a coding scheme
utilizing interframe compression, then I frames are selected for obtaining frame data for
decoding only corresponding frames into that of Dakss' invention. By doing so, the system
would be enhanced by adapting to the nature of the temporal video stream, thus providing higher
quality while utilizing lesser resources and obtaining lossless still image compression for its end
user.

As per claim 20, Dakss discloses the invention substantially as claimed above. Dakss,
however, does not disclose the limitation of wherein the user interface unit comprises a unit for
designating a summary level as a degree of video skimming.

Ratakonda discloses the limitation of wherein the user interface unit comprises a unit for
designating a summary level as a degree of video skimming as the technique of hierarchical,
multi-level summarization facilitates an effective way of visual interactive presentation of video
summary to the user. The user may interact with the summary via a graphical user interface, for
refining the summary, visualizing different levels of the summary (see col. 3, lines 52-56).

It would have been obvious to one having ordinary skill in the art at the time the
invention was made to include Ratakonda's teaching of wherein the user interface unit comprises a
unit for designating a summary level as a degree of video skimming into that of Dakss'
invention. By doing so, the system would be enhanced by providing multi-level summarization
to the user wherein the user may interact with the summary via a graphical user interface, for
refining the summary and visualizing the different levels of the summary based on the user's
desired task. Thus, the system would provide an enhanced tool to its end user.

As per claim 21, Dakss discloses the invention substantially as claimed above. Dakss, 
however, does not disclose the limitations of wherein the control unit reads the structure 
information index related to shot segmentation information and shot clustering information from 
an index file according to a skimming condition by using the external control information, 
calculates segments to be reproduced conforming to the video skimming condition, reproduces 
the corresponding segments from the video data, and outputs to the display unit. 

Ratakonda discloses the limitations of wherein the control unit reads the structure 
information index related to shot segmentation information and shot clustering information from 
an index file according to a skimming condition by using the external control information, 
calculates segments to be reproduced conforming to the video skimming condition, reproduces 
the corresponding segments from the video data, and outputs to the display unit as the technique 
of a video sequence may be indexed on the basis of its summary frames (see col. 4, lines 17-18), 
the hierarchical approach allows the user to browse quickly through a collection of video
sequences by considering their most compact summaries 22, with an option of accessing a finer
summary 24, 26, if the content of the most compact summary is indeed interesting (see col. 4,
lines 22-36), determining the number of keyframes to be allocated to each shot (see col. 6, lines
9-10), constructing a hierarchical summarization with multiple levels wherein levels vary in
terms of detail of frames (see col. 2, lines 31-33), and a hierarchical, multi-level summarization
facilitates an effective way of visual interactive presentation of video summary to the user. The 
user may interact with the summary via a graphical user interface, for refining the summary, 
visualizing different levels of the summary (see col. 3, lines 52-56). 

It would have been obvious to one having ordinary skill in the art at the time the
invention was made to include Ratakonda's teaching of the limitations of wherein the control unit
reads the structure information index related to shot segmentation information and shot
clustering information from an index file according to a skimming condition by using the
external control information, calculates segments to be reproduced conforming to the video
skimming condition, reproduces the corresponding segments from the video data, and outputs to
the display unit into that of Dakss' invention. By doing so, the system would be enhanced by
providing an enhanced video editing tool to its end user wherein the user can access and refine
the summary and visualize the different levels of the summary based on the user's desired task.

4. Claim 11 is rejected under 35 U.S.C. 103(a) as being unpatentable over Dakss et al.
(USPN: 6,642,940), hereinafter Dakss, in view of Ratakonda et al. (USPN: 5,956,026), and further
in view of Boezeman et al. (USPN: 6,188,396), hereinafter Boezeman.

As per claim 11 (method), Dakss-Ratakonda discloses the invention substantially as
claimed above. Ratakonda discloses motion in individual shots, and the reproduction length of
selected section is adjusted in response to the dissimilarity factor as the technique of Using
Motion Characteristic for Summarization, which provides an option for the pan frames to be
converted into an image mosaic for viewing purposes since detection of pan and zoom both
involve computing motion vectors (see col. 11, lines 27-33). Dakss-Ratakonda, however, does
not disclose the limitation of wherein the image, motion and audio similarities in the selected
shot representative of the selected scene includes similarities in frames, motion vectors and audio
data with different time positions.

Boezeman discloses the limitations of wherein the image, motion and audio similarities in
the selected shot representative of the selected scene includes similarities in frames, motion 
vectors and audio data with different time positions as the technique of synchronization of 
Animation, AudioPlay, VideoPlay, and Image in the Sequence Editor versus Time axis (see Figs. 
11-15). 

It would have been obvious to one having ordinary skill in the art at the time the
invention was made to include Boezeman's teachings of wherein the image, motion and audio
similarities in the selected shot representative of the selected scene includes similarities in
frames, motion vectors and audio data with different time positions into that of the Dakss-Ratakonda
combined invention. By doing so, the system would be enhanced by allowing the user to embed
audio, video, image as well as animation information corresponding to a particular time factor
based on the user's desired task. Thus, the system would provide an enhanced graphical user
interface to its end user.

Conclusion 

5. The prior art made of record and not relied upon is considered pertinent to applicant's
disclosure. Applicant is required under 37 C.F.R. 1.111(c) to consider these references fully
when responding to this action. The documents cited therein teach a method of using a graphical
based user interface for accessing, editing and controlling multimedia information programs. 

6. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to CUONG T THAI whose telephone number is (703) 308-7234. 
The examiner can normally be reached on 8:00 am - 4:00 pm. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, John Cabeca, can be reached on (703) 308-3116. The fax phone number for the
organization where this application or proceeding is assigned is 703-872-9306. 

Information regarding the status of an application may be obtained from the Patent 
Application Information Retrieval (PAIR) system. Status information for published applications 
may be obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR 
system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 




CUONG T THAI 
Examiner 
Art Unit 2173